KMID : 1120320170030000082
Osteoporosis and Sarcopenia
2017 Volume.3 No. 0 p.82 ~ p.82
Big data and osteoporosis: Using information mining to gather representative diagnostic data
Lin Guan-Lun

Liu Yue-Tzong
Chang Sue-Ting
Yan Yuan-Horng
Hong Joshua Jen-Shin
Abstract
Introduction: Osteoporosis is a significant skeletal disease characterized by low bone mass and deterioration of bone strength, which leads to increased bone fragility and susceptibility to fracture. Osteoporosis is preventable and treatable. However, the diagnosis rate (Dx) of osteoporosis is very low. The objective of this study was to develop a method using health information technology (HIT) to analyze unstructured data, such as the chief complaint (CC) in the documentation of patients' medical history and the history of present illness (PI) in electronic medical records (EMR), in order to improve the Dx of osteoporosis.

Materials & methods: We collected text data, including CC and PI, from the EMR stored in the hospital information system (HIS) Data Warehouse of one tertiary teaching hospital (1,300 beds) in the western coastal area of central Taiwan, 2014-2016. The average size of the text data stored in this warehouse per year was about 1.2 GB. Information extraction from sensitive data and expert assessment methods were employed to determine phrases for key term matching. To make sure the data could be sensibly used in the study, we formed an expert team that held expert meetings. The tasks of the team were to supervise and understand the use of the data, to make sure the use was reasonable and that sensitive information was not exposed, and to provide professional medical assistance and consulting services. The expert team used a context-preserving word cloud for CC and PI to determine phrases for key term matching. A scoring mechanism table was also used. The diagnosis of osteoporosis was defined by the International Classification of Diseases, Ninth Revision (ICD-9) diagnostic codes 733.00 and 733.01. The analysis had 4 phases: (1) Phase 1: Unstructured data were broken down into atoms (tokens) using natural language processing (NLP) techniques. (2) Phase 2: Tokens obtained in phase 1 were sorted according to their frequencies and compared with the text data. (3) Phase 3: A key term scoring table was created. (4) Phase 4: A graphic display of tokens for CC and PI was designed. To verify the practical value of the proposed method, the analysis of osteoporosis was divided into 3 processes: (1) Process 1: key value selection. (2) Process 2: matching with the ICD-9 codes, with a score range of {0, 1}. (3) Process 3: case result status analysis and expert meeting.
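
The phases and processes described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the key terms, the record layout (`id`, `cc`, `pi`, `icd9`), the function names, and the assumption of English free text are all hypothetical, chosen only to show tokenization, frequency sorting, {0, 1} key term scoring, and comparison against ICD-9 codes 733.00 and 733.01.

```python
import re
from collections import Counter

# Hypothetical key terms; the study's actual scoring table is not given here.
KEY_TERMS = {"back pain", "fracture", "height loss"}
# ICD-9 codes that define an osteoporosis diagnosis in the study.
OSTEOPOROSIS_ICD9 = {"733.00", "733.01"}


def tokenize(text):
    """Phase 1: break free text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def token_frequencies(notes):
    """Phase 2: count tokens across all notes, most frequent first."""
    counts = Counter()
    for note in notes:
        counts.update(tokenize(note))
    return counts.most_common()


def score(note):
    """Phase 3 / Process 2: return 1 if any key term occurs in the note, else 0."""
    normalized = " " + " ".join(tokenize(note)) + " "
    return int(any(f" {term} " in normalized for term in KEY_TERMS))


def flag_undiagnosed(records):
    """Return ids of records that match a key term but carry no osteoporosis code."""
    flagged = []
    for rec in records:
        text = rec["cc"] + " " + rec["pi"]
        has_code = bool(OSTEOPOROSIS_ICD9 & set(rec["icd9"]))
        if score(text) == 1 and not has_code:
            flagged.append(rec["id"])
    return flagged


# Hypothetical example records (not study data).
records = [
    {"id": 1, "cc": "Low back pain", "pi": "Fell at home, wrist fracture.", "icd9": ["733.00"]},
    {"id": 2, "cc": "Back pain for 3 months", "pi": "Reports height loss.", "icd9": ["401.9"]},
    {"id": 3, "cc": "Cough", "pi": "Fever for 2 days.", "icd9": ["486"]},
]
print(flag_undiagnosed(records))
```

In this sketch only record 2 is flagged: it matches key terms but has no osteoporosis code, which is the kind of potential case that would go on to expert review in Process 3.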

Results: This information mining method identified 20.5% (4,956/24,229), 21.7% (5,398/24,856), and 20.8% (5,226/25,135) of cases as potential osteoporosis cases in 2014, 2015, and 2016, respectively. By comparison, cases with osteoporosis ICD-9 codes accounted for only 2.2% (539/24,229), 2.4% (598/24,856), and 2.8% (697/25,135) in these 3 years, respectively.
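
The reported percentages can be re-derived from the counts given in the Results. The short Python sketch below does only that arithmetic; the variable and function names are illustrative.

```python
# year: (cases flagged by information mining, cases with ICD-9 codes, total cases)
COUNTS = {
    2014: (4956, 539, 24229),
    2015: (5398, 598, 24856),
    2016: (5226, 697, 25135),
}


def pct(part, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * part / total, 1)


for year, (mined, coded, total) in COUNTS.items():
    print(year, pct(mined, total), pct(coded, total))
```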

Conclusion: Our findings suggest that medical information mapping of osteoporosis symptom patterns based on information mining may help identify previously undiagnosed osteoporosis patients.
Listed journal information
Korea Research Foundation (KCI), KoreaMed